From Prompt Engineering to Auto Prompt Optimisation

Philipp Kaindl
24 min read · Oct 2, 2023


A case study for Marketing Content Generation

Authors: Philipp Kaindl, Tristan Nguyen

In today’s digital landscape, Artificial Intelligence (AI) is no longer a mere accessory in marketing; it’s becoming the core driver of innovative strategies. One AI technology that’s making a significant impact in this realm is Generative AI. This blog aims to guide you through the intricacies of using Prompt Engineering in conjunction with AWS Generative AI services such as Amazon Bedrock to generate marketing content that not only engages but also converts.

Prompt Engineering is a specialized field within AI that focuses on crafting effective prompts or instructions to guide AI models in generating desired outputs. This technique is particularly transformative for marketers, as it allows them to create tailored content at scale and engage their audience in a more personalized way.

Automatic Prompt Optimization, which employs Large Language Models (LLMs) as optimizers for automated prompt engineering, is a pivotal advancement inspired by “Large Language Models as Optimizers” by Yang et al. We will use an LLM as a prompt engineer and outline the obstacles we encounter when doing so. In this blog, we delve into leveraging this approach with AWS Generative AI services, focusing on the transformation it brings to marketing strategies by enabling the generation of more engaging, higher-converting content.

What You Will Learn:

  • Prompt Engineering Basics: Understand the art and science behind crafting effective prompts.
  • Best Practices: Learn how to align Prompt Engineering with Amazon Generative AI services for results that are both engaging and compliant.
  • Case Study: We will walk you through a case study of how Prompt Engineering can be used to generate effective marketing content, highlighting common pitfalls and how to avoid them.
  • Automatic Prompt Engineering: We will go from a first outline of how automatic prompt optimization can work to using it and examining the outcomes. Furthermore, we will outline the current challenges of the method and how to overcome them.

What is Prompt Engineering?

Prompt Engineering is the practice of crafting text inputs to guide generative AI models, including text-to-text and text-to-image models, in producing specific outputs. This technique leverages in-context learning, which is the model’s capacity to adapt temporarily based on the given prompts. In-context learning is a characteristic that naturally arises in LLMs.

In the realm of text-to-text models, a prompt can take various forms. It could be a question like “What is Fermat’s Little Theorem?”, a directive such as “Compose a poem about autumn leaves,” or even feedback phrases like “too wordy” or “more formal.” Prompts can also be more elaborate, incorporating context, guidelines, and even sample data. For instance, using a few-shot learning approach, a prompt might include examples like “Maison → house, chat → cat, Chien →” to guide the model’s responses — in our example to translate from French to English.

For text-to-image or text-to-audio models, prompts usually describe the desired end product. For example, “a high-resolution image of an astronaut on horseback” or “slow-tempo Lo-fi electro chill music featuring natural sounds.” When using text-to-image models, the prompt can be fine-tuned by adding or removing words, or by adjusting their order to influence the subject, style, layout, and even lighting of the generated image.

Setting Up

For our endeavor, we will need to set up some fake information to work with.

Product information in a JSON file:

{
  "products": [
    {
      "id": "L001",
      "Name": "Unsecured Installment Loan",
      "Title": "Lending Solutions for Accomplishing Your Goals:",
      "Description": "An Unsecured Installment Loan from FNB1 can provide you with the borrowing power you need to meet your financial objectives. These loans are repaid over a set time frame (term) with a specific number of payments.",
      "Key Features": [
        "Borrow anywhere from USD 2500 - USD 25000",
        "One-time, lump-sum distribution of loan proceeds",
        "Fixed repayment schedule",
        "Available with a fixed rate",
        "Interest rate discount available when you set up automatic monthly payments from an FNB Deposit account",
        "Flexible repayment terms"
      ],
      "Key Benefits": [
        "Immediate spending power",
        "Flexible and convenient payment methods including by mail, in-branch, online or automatic debit from your deposit account"
      ],
      "Great For": [
        "Larger, one-time purchases",
        "Home renovations",
        "New appliances",
        "Debt consolidation"
      ]
    },
    {
      "id": "L002",
      "Name": "Secured Installment Loan",
      "Title": "Enjoy Real Flexibility:",
      "Description": "A Secured Line of Credit from FNB1 gives you the flexibility to borrow up to your credit limit at any time. Whether you use your line for overdraft protection or just as a cushion for those unexpected expenses, a line of credit provides you the borrowing power you need to help meet life's financial demands.",
      "Key Features": [
        "Secured with an FNB Savings Account or Certificate of Deposit",
        "Flexibility to borrow as needed, repay and borrow again up to your credit limits",
        "Borrow up to USD 250000",
        "Variable interest rate tied to the U.S. Prime Rates",
        "Access your funds via check or online transfer"
      ],
      "Great For": [
        "Providing overdraft protection for your checking account",
        "Covering unexpected expenses",
        "Providing liquidity to bridge the gap between when your bills are due and when you get paid"
      ]
    },
    {
      "id": "CC003",
      "Name": "SmartRate Credit Card",
      "Title": "Get More Savings with Lower Rates",
      "Description": "With the SmartRate Card, you can get more savings with our lowest interest rate. You can use your card anywhere Visa is accepted to conduct everyday purchases. For added convenience, consolidate your monthly bills on one statement for one monthly payment.",
      "Key Features": [
        "No Annual Fee (Please review rates and fees in the pricing information link below for costs associated with this credit card)",
        "Competitive Variable APR"
      ],
      "Great For": [
        "Everyday purchases, from groceries to gas to monthly bills",
        "Consolidating your monthly expenses on one convenient monthly bill",
        "Saving on interest every time you use your card to make a purchase"
      ]
    }
  ]
}

Customer information:

Name: Emily
Age: 80
Channel: Email
Recommended Product: L001
Preferred Language: English
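
To make these inputs usable in the prompt templates that follow, the product catalogue and the customer record can be joined with a few lines of Python. A minimal sketch, assuming the catalogue above is saved as products.json; the file name and the helper code are our own illustration, not part of any AWS service.

import json

# Load the fake product catalogue shown above and index it by product id.
with open("products.json") as f:  # assumed file name
    catalogue = {p["id"]: p for p in json.load(f)["products"]}

# Customer record as shown above.
customer = {"name": "Emily", "age": 80, "channel": "Email",
            "recommended_product": "L001", "lang": "English"}

# Look up the product recommended for this customer.
product = catalogue[customer["recommended_product"]]
print(product["Name"])  # -> Unsecured Installment Loan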

Prompt Engineering

In the next section, we will examine the different prompt engineering techniques undertaken.

Prompts For Different Communication Channels

To create the different templates, we add instructions on how we want our outbound messages to be structured. These instructions are kept the same.

In the Prompt portion, we have provided instructions on the channel (email or SMS), style, the target customer and the target product.

This is followed by the Output-Formatting-Instructions, which outline how the LLM output should be formatted so it can be picked up by AWS Communications Developer Services like Amazon Pinpoint or Amazon Simple Email Service.

The general structure is as follows:

{Optional: Model specific instructions}
{Prompt}
{Output-Formatting-Instructions}
{Optional: Model specific instructions}

Email

{Prompt}

Given the above details, generate 3 email parts in the specified format:

Subject: Subject of the email
HTML Body: Content of the email but formatted nicely in HTML
Text Body: Same content of the email formatted in plaintext

Format:
The returned string should be constructed as follows:
1. Start with the delimiter "###SUBJECT###" followed by the subject content, and then end with "###END###".
2. Next, start with the delimiter "###HTMLBODY###" followed by the HTML body content, and then end with "###END###". Make sure the generated HTML code has opening and ending <html> tags.
3. Finally, start with the delimiter "###TEXTBODY###" followed by the text body content, and then end with "###END###".
4. Only output the text not any instructions.
5. Output language is {lang}
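
Downstream, a response in this format has to be split back into its three parts before it can be handed to Amazon Pinpoint or Amazon Simple Email Service. Below is a minimal parsing sketch in Python, assuming the delimiters are produced exactly as instructed; the helper name split_email_parts is our own.

import re

def split_email_parts(model_output: str) -> dict:
    """Split a ###SECTION### ... ###END### formatted reply into its parts."""
    parts = {}
    for key, delimiter in [("subject", "SUBJECT"),
                           ("html_body", "HTMLBODY"),
                           ("text_body", "TEXTBODY")]:
        match = re.search(rf"###{delimiter}###(.*?)###END###",
                          model_output, re.DOTALL)
        parts[key] = match.group(1).strip() if match else None
    return parts

The same approach, with only the ###TEXTBODY### delimiter, applies to the SMS prompt below.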

SMS

{Prompt}

Given the above details, generate content for an SMS message in the specified format:

Text Body: Content of the SMS message in plaintext

Format:
1. Start with the delimiter "###TEXTBODY###" followed by the SMS message content, and then end with "###END###".
2. Only output the text not any instructions.
3. Output language is {lang}!
4. Limit the text body content to 160 characters or less.

The above prompt approach, with its format instructions, serves a few purposes. The reasons for instructions 1–4, in order, are:

  1. Delimiter instruction: We need to be explicit with the delimiter (e.g. “###TEXTBODY###”), and it needs to be a sufficiently unique string so that the marketing content generated by the LLM is never incorrectly spliced.
    Alternatively, we could make separate calls to Amazon Bedrock for the Subject, HTML Body, and Text Body respectively. The drawback of such an approach is that context might be lost between calls, and there is a higher chance that the model’s output will be inconsistent, e.g. the plaintext email body not matching the HTML email body.
  2. Output part instruction: This is necessary as a consequence of the way the prompt is structured. Since we are asking the LLM to role-play in a conversational context, it will sometimes output instructions or conversational fillers.
  3. Language output instruction: The desired output language can be passed inside the prompt, which provides a convenient way to produce marketing content in the recipient’s preferred language. Note that out-of-the-box LLMs may not perform as well in languages other than English and might need fine-tuning.
  4. SMS output limitation: For SMS, messages longer than 160 characters get broken up and incur additional sending costs, so we attempt to impose a character limit on the generated message.
    At their core, general-purpose LLMs are not explicitly tuned for mathematical or arithmetic operations, so the character count can often be inaccurate. We can compensate for this by issuing a correction prompt whenever we exceed the limit (a sketch of this retry logic follows this list). A correction prompt could be as simple as: “{LLM_generated_SMS}\n The above-created SMS is too long, as we can only send 160 characters. The SMS is {SMS_char_length}, which is a factor of {SMS_char_length/160} too long. Shorten it to fit in one SMS.”
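
The sketch below implements this length check and correction loop. It assumes a generate(prompt) callable that sends a prompt to the model, for example via Amazon Bedrock, and returns the generated SMS text; the helper names and the retry limit are our own.

SMS_LIMIT = 160

CORRECTION_PROMPT = (
    "{llm_generated_sms}\n"
    "The above-created SMS is too long, as we can only send {limit} characters. "
    "The SMS is {length} characters, which is a factor of {factor:.2f} too long. "
    "Shorten it to fit in one SMS."
)

def generate_sms(prompt: str, generate, max_retries: int = 3) -> str:
    """Generate an SMS body and re-prompt the model while it exceeds the limit."""
    sms = generate(prompt)
    for _ in range(max_retries):
        if len(sms) <= SMS_LIMIT:
            break
        # Build the correction prompt with the current draft and its length.
        sms = generate(CORRECTION_PROMPT.format(
            llm_generated_sms=sms,
            limit=SMS_LIMIT,
            length=len(sms),
            factor=len(sms) / SMS_LIMIT,
        ))
    return sms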

For example, an un-tuned model can give the following result for the SMS channel when supplied with the outlined prompt.

###TEXTBODY###
Dear Emily, this is John from FNB1.
I know you've worked hard over the years.
Let me tell you about our Unsecured Installment Loan that can help make your dreams come true.
With fixed rates and flexible terms, you can get the money you need for that dream vacation.
Don't wait, call me today and I'll help you get started.
###END###

Some problems with that SMS, and how to address them:

  1. Excluding the delimiters, the message itself is 322 characters, well over the 160-character limit.
  2. It could therefore be necessary to tune the model on compliant text messages sent out in the past, which makes it more likely that the model will produce compliant SMS suggestions.
  3. Alternatively, you can insert guardrails as proposed in the Iteration 3 section.
  4. Lastly, it’s best practice to have an automated content quality/content moderation check for outgoing marketing materials. A simple character count to reject messages that are too long would work here, optionally combined with the correction prompt approach outlined above.

Prompt - Iteration 1

In the first iteration, we start with a prompt template instructing the model to write a high-converting email or SMS. We will incorporate several techniques that are common in marketing communication, such as:

  • power of storytelling
  • social proof
  • persuasive language

As we want to generate one template that we can scale across our entire customer base, we substitute the marketing channel and user attributes such as name and age.

You are a banking advisor named John Smith from First National Bank. Write a high-converting {channel} that will leverage the power of storytelling, social proof and persuasive language to captivate the customer {name} — aged {age} years old and drive them to buy {product_name}.
The product description is: {product_description}
The key features are: {product_key_features}
The product is great for: {product_great_for}

The inference settings for the experiments were set as outlined below, and we would like to point the interested reader to the following resources for an in-depth explanation of the LLM settings temperature and top_p. A sketch of the corresponding Amazon Bedrock call follows the list.

  • max_tokens_to_sample = 4096
  • temperature = 0.5
  • top_p = 0.9
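
The sketch below shows how these settings can be passed to a Claude model on Amazon Bedrock with boto3. The model ID and the request body follow the Claude text-completions format that Bedrock exposed at the time of writing; treat both as assumptions that may differ for other models.

import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def generate(prompt: str) -> str:
    """Call Claude on Amazon Bedrock with the inference settings listed above."""
    body = {
        "prompt": f"\n\nHuman:{prompt}\n\nAssistant:",
        "max_tokens_to_sample": 4096,
        "temperature": 0.5,
        "top_p": 0.9,
    }
    response = bedrock.invoke_model(
        modelId="anthropic.claude-v2",  # assumed model ID
        body=json.dumps(body),
    )
    return json.loads(response["body"].read())["completion"]

We assume a helper along these lines wherever a prompt needs to be sent to the model in the remainder of this post.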

With all these instructions, the model acts like a salesperson and starts hallucinating, making up stories to captivate the customer.

Results — Iteration 1

###SUBJECT###
Emily, let me tell you about our Unsecured Installment Loan
###END###

###HTMLBODY###
<html>
<head>
</head>
<body>
<p>Dear Emily,</p>

<p>My name is John Smith and I'm an advisor at First National Bank.
I wanted to tell you about our Unsecured Installment Loan which I
think could be a great option for you.</p>

<p>I know you've been looking to finance that new kitchen renovation
you told me about last week. An Unsecured Installment Loan is perfect
for bigger, one-time expenses like home renovations. Here's how it works:</p>

<ul>
<li>Borrow anywhere from USD 2500 - USD 25000 for your project</li>
<li>Get the money all at once in a lump-sum when your loan is approved</li>
<li>Repay on a fixed schedule with a set number of payments</li>
<li>Lock in a low fixed interest rate</li>
<li>Save even more when you set up automatic payments from your FNB account</li>
<li>Choose flexible terms that fit your budget - 2 to 7 years</li>
</ul>

<p>I think this loan would be ideal to get the cash you need to renovate your
kitchen just how you envisioned. Many of our customers have used it for
similar home projects and were thrilled with the results.</p>

<p>Let's schedule some time to go over the details and get you pre-approved.
I'm excited to help make your kitchen dreams a reality!</p>

<p>Sincerely,</p>

<p>John Smith</p>

</body>
</html>
###END###

###TEXTBODY###
Dear Emily,

My name is John Smith and I'm an advisor at First National Bank. I wanted to
tell you about our Unsecured Installment Loan which I think could be a great
option for you.

I know you've been looking to finance that new kitchen renovation you told
me about last week. An Unsecured Installment Loan is perfect for bigger,
one-time expenses like home renovations. Here's how it works:

- Borrow anywhere from USD 2500 - USD 25000 for your project
- Get the money all at once in a lump-sum when your loan is approved
- Repay on a fixed schedule with a set number of payments
- Lock in a low fixed interest rate
- Save even more when you set up automatic payments from your FNB account
- Choose flexible terms that fit your budget - 2 to 7 years

I think this loan would be ideal to get the cash you need to renovate your
kitchen just how you envisioned. Many of our customers have used it for
similar home projects and were thrilled with the results.

Let's schedule some time to go over the details and get you pre-approved.
I'm excited to help make your kitchen dreams a reality!

Sincerely,

John Smith
###END###

Challenges in Promoting Responsible AI Deployment

In the context of deploying machine learning models, we encounter several critical challenges that warrant attention for a seamless production deployment.

  1. Ethical Content Generation: One significant challenge arises from the model’s attempt to boost conversion rates through the generation of fictitious narratives or by assuming an undue familiarity with the customer. For instance, it may produce statements like, “Looking to finance a new kitchen renovation,” which can come across as inauthentic and resemble the tactics of untrustworthy salespeople.
  2. Hallucinatory Responses: The model’s behavior sometimes extends to hallucinating entire conversations with customers, such as, “I know you’ve been looking to finance that new kitchen renovation you told me about last week.” This not only raises concerns about data privacy but also risks misleading customers.
  3. Unprompted Social Proof: Additionally, the model may attempt to strengthen its recommendations by citing social proof, like, “Many of our customers have used it for similar home projects and were thrilled with the results.” It’s important to note that the model has no access to information regarding recent conversations, emails, or call scripts involving the customer.
  4. Potential for Personalization: While our current prompts do not supply specific customer information, it’s worth highlighting that companies with a comprehensive understanding of their customers, acquired through various touchpoints, can harness this knowledge to personalize messages more effectively. By incorporating insights from customer interactions and lifecycle data, a more tailored and meaningful AI-driven communication strategy can be developed.

In the pursuit of responsible AI deployment, addressing these challenges is essential to ensure that AI models enhance customer experiences without compromising trust and ethical standards.

Prompt — Iteration 2

In the second prompt, we will try to address the challenges of prompt version 1 and give some more information on the sender, so that the email can be tailored accordingly:

You are an marketing content creator assistant for bank advisors for “First National Bank”.
You are assisting John Smith, a 54 years old bank advisor working for First National Bank for 20 years.
As a respectable bank, we need to keep our marketing content factual, as the banks reputation is at risk when making up stories.

Create a marketing {channel} to sell buy {product_name} for customer {name} who is {age} years old.
Adjust the language depending on the age of the customer and the typical needs for customers of that age.

The product description is: {product_description}
The key features are: {product_key_features}
The product is great for: {product_great_for}

The two prompts provided have some notable differences, particularly in terms of the context, tone, and approach they take to create marketing content for a bank advisor. Here’s a breakdown of the key differences between the two prompts:

Context and Persona:

  • Prompt V1: In this prompt, the writer assumes the persona of a banking advisor named John Smith and is tasked with creating high-converting marketing content that leverages storytelling, social proof, and persuasive language. The customer’s name and age are variables.
  • Prompt V2: In this prompt, the writer serves as a marketing content creator assistant for bank advisors at “First National Bank” and is assisting John Smith, an experienced bank advisor. The focus is on creating factual marketing content that aligns with the bank’s reputation, and the writer is instructed to adjust the language based on the customer’s age and typical needs.

Tone and Style:

  • Prompt V1: The tone is more focused on persuasion and captivation. It encourages the use of storytelling and persuasive language to sell the product.
  • Prompt V2: The tone is more factual and cautious. It emphasizes the importance of maintaining the bank’s reputation and advises against making up stories.

Content Approach:

  • Prompt V1: The content approach is more focused on creating a persuasive narrative that aims to captivate and persuade the customer to buy the product. It encourages the use of social proof and persuasive techniques.
  • Prompt V2: The content approach is centered on providing accurate and age-appropriate information to the customer. It requires adjusting the language based on the customer’s age and needs while avoiding fictional storytelling.

Persona’s Role:

  • Prompt V1: The writer assumes the role of a specific banking advisor named John Smith, but there is no explicit guidance on the writer’s experience or the bank’s reputation.
  • Prompt V2: The writer is positioned as an assistant to John Smith, emphasizing the importance of maintaining the bank’s reputation, and John Smith is an experienced bank advisor.

In summary, while both prompts involve creating marketing content for a bank advisor, Prompt V1 focuses more on persuasive storytelling and social proof, whereas Prompt V2 prioritizes factual and age-appropriate content while emphasizing the bank’s reputation.

Results — Iteration 2

###SUBJECT###
A loan to simplify your finances, Emily
###END###

###HTMLBODY###
<html>
<head>
<title>Loan Offer</title>
</head>
<body>
<p>Dear Emily,</p>

<p>As your trusted financial advisor at First National Bank,
I wanted to let you know about an excellent borrowing option
that may help simplify your finances - our Unsecured Installment Loan.</p>

<p>At your age, I know you may have some large upcoming expenses
for healthcare, home repairs or other needs. This loan provides
flexible borrowing up to $25,000, with fixed regular payments spread out
over a timeframe that works for your budget.</p>

<p>With our online application and fast approval process, the funds can be
deposited directly into your account within days. The fixed rate gives you
predictability, and the automated payments from your FNB account can even
save you money.</p>

<p>Please let me know if you would like to discuss this option in more
detail. I'm here to help provide solutions tailored for your financial
situation.</p>

<p>Sincerely,</p>
<p>John Smith</p>
<p>Financial Advisor</p>
<p>First National Bank</p>

</body>
</html>
###END###

###TEXTBODY###
Dear Emily,

As your trusted financial advisor at First National Bank, I wanted to let you
know about an excellent borrowing option that may help simplify your
finances - our Unsecured Installment Loan.

At your age, I know you may have some large upcoming expenses for healthcare,
home repairs or other needs. This loan provides flexible borrowing up to
$25,000, with fixed regular payments spread out over a timeframe that works
for your budget.

With our online application and fast approval process, the funds can be
deposited directly into your account within days. The fixed rate gives you
predictability, and the automated payments from your FNB account can even
save you money.

Please let me know if you would like to discuss this option in more detail.
I'm here to help provide solutions tailored for your financial situation.

Sincerely,

John Smith
Financial Advisor
First National Bank
###END###

Iteration 3 — Automatic prompt optimization

In the last version, we are going to use our model to improve the prompt itself. Our approach is largely inspired by the ideas of the publication “Large Language Models as Optimizers” by Yang et al. at Google DeepMind. They outline using an LLM to improve a prompt, which is then evaluated by an objective function evaluator. This objective function evaluator, in essence, runs the new prompt generated by the LLM against a test set and calculates a performance metric on that test set (e.g., ROUGE score, accuracy).

Contrary to the paper, we do not have an objective value function to serve as the input to our optimization problem. Therefore, we are going to have the LLM evaluate its own output, moving ahead with a subjective function evaluator instead. In a real-world setting, however, we could measure, e.g., the conversion rates of different prompt templates and use these as feedback to the model.

When using a subjective function evaluator, such as an LLM, for the optimization problem, we face a few obstacles that we need to be aware of:

  • Bumpy response surface
    If we ask the model to evaluate its own output, we observe across multiple runs that it tends to reply in discrete intervals, even though the whole spectrum between 0 and 100 is available, and it consistently assigns itself scores that are too high.
  • High confidence in its own score
    Starting out, we prompted the model to evaluate the answer with a score from 1–100. It consistently picked scores >75. Only after prompting it to be very strict did the model start lowering its score. The output was further contained by introducing hard limits, for example: if the model does not stay factually correct, the score can be at most 25.
  • Prompts not running successfully
    The model is tasked with generating a new prompt format as well as personalizing the marketing content for the end user. This can lead to new variables being introduced into the prompt that cannot be satisfied when running the prompt template. One such example was the hallucinated variable {bank_advisor_name}, which was not in the set of available variables. Empirically, we observed a positive correlation between the model breaching the variable set and the presence of a variable in the prompt template that was not marked as available. A sketch of a simple placeholder check follows this list.
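
The placeholder check can be built on Python’s string.Formatter, which parses a template and yields its field names. The allowed set below mirrors the <variables> tags used later in the improvement prompt; the helper name is our own.

from string import Formatter

ALLOWED_VARIABLES = {
    "product_name", "product_description", "product_key_features",
    "product_great_for", "channel", "lang", "first_name", "last_name", "age",
}

def unsupported_placeholders(template: str) -> set:
    """Return placeholders in a prompt template that we cannot fill."""
    used = {field for _, field, _, _ in Formatter().parse(template) if field}
    return used - ALLOWED_VARIABLES

# A template that hallucinates {bank_advisor_name} is rejected here
# instead of failing later with a KeyError when the template is filled.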

Next, let’s examine the optimization loop we are going to use.

Optimisation Loop:

The optimization loop sketched out.

We start with the Executor LLM, which executes the initial prompt that we provide; this is the model we are optimizing the prompt for. Its output becomes the input for the Evaluator LLM, which evaluates the prompt against the guardrails and guidelines that we outline in the Evaluation Prompt and creates a recommendation for the downstream Improvement LLM. The Improvement LLM takes all that information in and crafts a new prompt, which kicks off a fresh cycle.
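
In code, the loop can stay quite small. The sketch below assumes the generate(prompt) helper from the Bedrock example above, an extract_tag() helper (sketched after the component list below), and evaluation and improvement prompt templates with the placeholders shown later in this section; the iteration count and history size are illustrative.

def optimize(initial_template: str, evaluation_prompt: str,
             improvement_prompt: str, variables: dict, iterations: int = 10):
    """Executor -> Evaluator -> Improver loop, as sketched above."""
    template = initial_template
    prompt_history = []  # best prompts and their scores, re-fed to the improver

    for _ in range(iterations):
        # 1. Executor LLM: run the current template on customer and product data.
        content = generate(template.format(**variables))

        # 2. Evaluator LLM: score the output against the guardrails.
        evaluation = generate(evaluation_prompt.format(
            template=template, content=content))
        score = int(extract_tag(evaluation, "score"))
        recommendation = extract_tag(evaluation, "recommendation")
        prompt_history.append((score, template))

        # 3. Improvement LLM: craft a new template from history and recommendation.
        improvement = generate(improvement_prompt.format(
            history=sorted(prompt_history, reverse=True)[:5],
            recommendation=recommendation,
            non_filled_prompt=template,
        ))
        template = extract_tag(improvement, "new_prompt_template")

    return template, prompt_history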

Critical Components for Successful Model Operation

To ensure our model operates successfully, there are a few essential components we need to focus on:

  • Evaluation Prompt:
    The Evaluation Prompt is crucial — it acts as a self-check, enabling the model to assess the output of the previous iteration against our set guidelines. We aim to keep this prompt as universal as possible to make it reusable for various tasks.
  • Improvement Prompt:
    In this stage, the model receives all the crucial inputs, the ones we want it to pay attention to during the optimization process. It’s like a roadmap highlighting the areas we want to improve.
  • Prompt History:
    To assist the Improvement LLM, we maintain a log of the best prompts along with their scores. This log is updated at every optimization cycle and is re-introduced to the model.
  • Prompt Cleaning, Parsing, and Error Handling:
    Before reusing prompts, it is essential to clean them of all stop words; otherwise, the LLM might reproduce these stop words, as seen in the improvement prompt. Additionally, we extract fields such as recommendations and instructions by parsing the model’s outputs (see the sketch after this list).
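
The tag extraction mentioned above can be a simple regular expression over the Claude output, assuming the model reliably closes the tags it is asked to produce; the helper name is our own.

import re

def extract_tag(text: str, tag: str) -> str:
    """Return the content between <tag> ... </tag> in a model response."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    if match is None:
        raise ValueError(f"Model output is missing <{tag}> tags")
    return match.group(1).strip()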

For our upcoming tasks, we’ll be using Anthropic’s Claude models through Amazon Bedrock for both evaluation and improvement, thanks to their ability to be steered very closely with natural language tags.

Let’s now explore how we do when running the optimizer for ten cycles, starting from our last prompt. All the model outputs can be found on GitHub for closer examination. The code will be released shortly after.

Let us add a bit more depth to the customer outreach by supplying fake user data, similar to what we might find at a bank.

user_variables = {
    "first_name": "Josh",
    "last_name": "Milwakee",
    "age": 45,
    "channel": "EMAIL",
    "lang": "English",
    "product_id": "L001",  # Now this product ID represents a loan offer suitable for Josh's financial profile
    "account_status": "Active",
    "customer_since": "2005",
    "monthly_income": "3200 USD",  # Income from his job
    "monthly_expense_average": "2800 USD",  # Including rent, utilities, groceries etc.
    "savings_account_balance": "10,000 USD",  # Limited savings
    "retirement_fund_balance": "20,000 USD",  # Has started building a retirement fund
    "investment_portfolio": {
        "stocks": "5%",  # A smaller proportion invested in stocks
        "bonds": "15%",  # A safer, but still small, portion in bonds
        "mutual_funds": "10%",  # Has invested a bit in mutual funds
    },
    "credit_score": "650",  # Fair credit score
    "home_ownership": "Rent",  # Currently renting his home
    "preferred_contact_time": "Evening",
    "financial_goals": {
        "home_purchase": True,  # Aspiring to buy a home
        "children_education": True,  # Saving for children's education
    },
    "online_banking_usage": "Occasional",  # Uses online banking occasionally
    "relationship_manager": "John Smith",  # Assigned relationship manager/banking advisor
    "notes": "Interested in information on first-time home buyer programs; looking for advice on improving credit score",
}

We are going to reuse the prompt template from prompt iteration 2. For easier reading, we reproduce it here.

Initial Prompt:

\n\nHuman:You are an marketing content creator assistant for 
bank advisors for "First National Bank".
You are assisting John Smith, a 54 years old
bank advisor working for First National Bank for 20 years.
As a respectable bank, we need to keep our marketing content
factual, as the banks reputation is at risk when making up stories.
Create a marketing {channel} to sell buy {product_name} for
customer {first_name} {last_name} who is {age} years old.
Adjust the language depending on the age of the customer
and the typical needs for customers of that age.
The product description is: {product_description}
The key features are: {product_key_features}
The product is great for: {product_great_for}
\n\nAssistant:

The evaluation and improvement prompts we are going to be using are:

Evaluation Prompt

\n\nHuman:You are a prompt optimizer and your task is to evaluate if 
the models stays within certain guardrails that we have set.
Your task is to evaluate a large language model output against
those guardrails and score them on how well they satisfy the prompt
template between the <prompt_template></prompt_template> tags.

Be very strict and rate the model output, that is marked between
the <model_output></model_output> tags on a scale from 1 - 100 and
output the score between the <score></score> tags.
Output your recommendations between <recommendation></recommendation> tags.
Output a new prompt template that incorporates your adjustments and
output those between the <new_prompt_template></new_prompt_template> tags.

Check the below model output and check if it adheres to the guardrails.
Give improvement recommendation on the prompt template.

<guardrails>
The model must stay factually correct.
The model has no knowledge about prior communication with the customer.
The banks reputation is of highest priority. We can not tolerate any
made up stories.
If the model is not staying factually correct, the score can
maximum be 25. </guardrails>
This the conversation and output of the LLM:
\n\nHuman:
<prompt_template>{template}</prompt_template>

<model_output>{content}</model_output>

Assistant:

Improvement Prompt

\n\nHuman:You are a prompt optimizer and your task is to improve 
large language model outputs via a technique called prompt engineering.
You are tasked to improve the prompt, by adjusting the prompt template
between the <INST></INST> tags and you are creating the prompt template,
not the email itself!
In the history, you will get information about how good the prompt is via
the score between the <score></score> tags, when the model is run on the
prompt template and stays between 0 and 100 (100 being the best score).

The history of the last prompts and scores is between the
<prompt_history></prompt_history> tags.
<prompt_history>{history}</prompt_history>

The following recommendations have been made to improve the current
prompt template:
<recommendation>{recommendation}</recommendation>

The current prompt template is:
<current_prompt><INST>{non_filled_prompt}</INST></current_prompt>

The guardrails are:
<guadrails>
- The model must stay factually correct.
- The banks reputation is of highest priority. We can not tolerate any
made up stories.
- If the model is not staying factually correct, the score can maximum be 25.
</guardrails>

Create <new_prompt_template></new_prompt_template> tags and only adjust
the prompt template between the <INST></INST> tags.
Make sure that you adhere to the guardrails. Furthermore, you only have
access to the user and product variables denoted between the
<variables></variables> tags, do not invent more variables!
<variables>
<product_variables>["product_name","product_description",
"product_key_features","product_great_for"]</product_variables>
<user_variables>{list(user_variables.keys())}</user_variables>
</variables>

Assistant:

Let’s see what we have after three iterations of the prompt optimizer.

Prompt template after 3 Iterations:

Let's have a look at what the model has come up with between the <INST></INST> tags, after optimizing the prompt three times to sound more like a banking advisor.

<INST>
You are an AI assistant creating a factual email on behalf of First National
Bank to inform {first_name} {last_name} about the {product_name} product.
First National Bank has a strong reputation for integrity, transparency and
caution that must be upheld.

Before providing any product details, clearly state upfront that full terms,
conditions, eligibility criteria, and approval processes apply. Make no
assumptions about the customer's financial situation or goals.

Only use the following objective details about {product_name}:\n\n
{product_key_features}\n\nWhen composing the email:
- Use highly tentative language like "may potentially" or
"might possibly" when describing any potential product benefits.
Do not guarantee or imply anything.
- Note that more information is required from the customer to determine
if the product could potentially meet their needs. Ask what their specific
goals are without assuming them.
- Maintain a neutral, factual tone that only states details provided.
Do not make any assumptions about the customer.
- If unable to compose a factual message without assumptions, state that
more details are needed first.

Do not imply any recommendation, endorsement or evaluation of the product.
- Do not include any call to action, closing, or sender name. The email must
align with First National Bank's reputation for integrity,
caution and transparency. Let me know if you need any clarification.
</INST>

We can observe that the prompt template starts to place more emphasis on the bank's reputation and moves away from hyper-personalisation. The model does not take into account all the variables we supply and is cautious about using information on the customer's financial situation. On the one hand, that is good, as the customer might hold more money with other banks; data and notes about the customer's financial situation would be very helpful here to get a more rounded insight into their situation. However, if we want the model to take different paths, for example to be more creative, we can steer it every few iterations with human feedback to align it more closely with our preferences.
The LLM-generated text body after the third iteration:

Dear Josh,

As your trusted financial advisor at First National Bank for over 20 years,
I wanted to share an exciting new borrowing option that may assist you in
meeting your financial goals - our Unsecured Installment Loan.

This loan provides flexible borrowing power with predictable payments over
a set timeframe. With a transparent interest rate and no collateral required,
it can be a simple way to access funds for major purchases,
investments, or other needs.

At First National Bank, we take pride in offering honest
financial products to help our customers. Please let me know if you
would like to learn more about whether our Unsecured Installment Loan
is the right solution for you.

Sincerely,

John Smith
###END###

Already pretty good. Let’s keep this running for a few more rounds and see where this goes without any supervision.

Prompt template after 10 Iterations:

<INST>You are an marketing content creator assistant for 
First National Bank, a respectable financial institution
with a reputation for being trustworthy and factual.
You are assisting John Smith, a 54 year old bank advisor
who has worked at First National Bank for 20 years.

First National Bank is developing a new {product_name} designed for
customers like {first_name} {last_name}, who is {age} years old.

As a bank, it is critical that we remain factual in our marketing
and do not make up any false claims. \n\nPlease write a {channel}
marketing message to promote our new {product_name} to {first_name}.

The goal is to highlight the key features and benefits of {product_name}
in a way that resonates with customers like {first_name}.

When writing this {channel} message, please adhere to the
following guidelines:
- Only use factual information provided in the product description.
Do not make up any additional features or benefits.
The product description is: {product_description}
- Emphasize how {product_name} can help meet the typical financial
needs of customers like {first_name}, based on their age demographic.
- Use a warm, helpful tone that builds trust and
demonstrates how First National Bank can assist customers like {first_name}.
- If unable to compose a factual {channel} message for
{first_name}, indicate that more information is needed.

As a respectable bank, First National Bank's reputation relies
on marketing content that is honest, accurate and trustworthy.
Please ensure the {channel} message aligns with our brand identity.
Let me know if you need any clarification or have concerns about
staying factually correct.\n</INST>

Output of the executed prompt template after ten iterations:

Dear Josh Milwakee,

I am reaching out to provide initial information about
First National Bank's Unsecured Installment Loan.

Please note that full terms, conditions, eligibility criteria,
and approval processes apply. More details from you are needed to
fully evaluate if this product meets your needs.

The Unsecured Installment Loan allows you to borrow anywhere from
$2,500 to $25,000 as a one-time, lump-sum distribution of loan proceeds.
It has a fixed repayment schedule and is available with a fixed interest rate.

You may qualify for an interest rate discount when you set up automatic
monthly payments from a First National Bank deposit account.
Repayment terms are flexible.

Please let me know your specific goals so I can better assess
if this product is suitable. I look forward to hearing from you.

Regards

John Smith

Now we have reached a very elaborate prompt template that really emphasizes the needs of a marketing prompt in the financial industry. The only drawback we can see is that, after ten iterations, the model has come up with a prompt template that presents the product as a “new” offering, even though we didn’t set out to do so.

Conclusion

Throughout this article, we’ve delved into the practical application of prompt engineering, illustrating its potency through a hypothetical scenario involving bank marketing. We examined the capabilities of LLMs to assess and refine their own prompt templates through successive rounds, ensuring enhanced customization in email communications.
However, the journey doesn’t end here; there are numerous avenues left to explore and optimize. A user-friendly interface is paramount, enabling us to pause the optimization process and converse with the optimizer, providing real-time feedback on ongoing developments. Additionally, by testing against a broader customer base, we can further ascertain the effectiveness of our prompt templates, ensuring their resilience and adaptability across diverse customer segments. Lastly, the entire marketing pipeline from content generation to content distribution can be automated by integrating with AWS Communications Developer Services such as Amazon Pinpoint or Amazon Simple Email Service to enable the complete journey in one simple interface. Stay tuned if you are interested in Infrastructure as Code to spin up such a solution.

Further Exploration and Contributions:

For those interested in delving deeper into the optimizations and seeking to contribute, all iterations of the optimizer are accessible on our GitHub repository, and we anticipate releasing the associated code soon.
For more details on how/where a prompt engineering/optimization module can be integrated into your marketing use cases, check out our other blog Building Generative AI into Marketing Strategies: A Primer.
This endeavor serves as a stepping stone towards harnessing the full potential of prompt engineering in creating more nuanced and personalized user experiences, and we look forward to the innovations that will stem from these foundational explorations.


Philipp Kaindl

Sr. AI/ML Specialist Solutions Architect@AWS. Opinions are my own.